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Abstract 

By an efficient algorithm we evaluate exactly the disorder-averaged statistics 
of globally neutral self-avoiding chains with quenched random charge qi = 
±1 in monomer i and nearest neighbor interactions oc q%qj on square (22 
monomers) and cubic (16 monomers) lattices. At the theta transition in 
2D, radius of gyration, entropic and crossover exponents are well compatible 
with the universality class of the corresponding transition of homopolymers. 
Further strong indication of such class comes from direct comparison with the 
corresponding annealed problem. In 3D classical exponents are recovered. 
The percentage of charge sequences leading to folding in a unique ground 
state approaches zero exponentially with the chain length. 

PACS numbers: 64.60.-i; 36.20.-r; 33.15.-e; 02.70.-c. 
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During the last years random heteropolymers have been the object of intense study: they 
play a central role for our understanding of the properties of biologically active molecules 
]1|] and represent a relatively handable model of disordered system, of great interest in 
statistical mechanics Particular attention has been focused recently on randomly 

charged polymers (polyampholytes) []5| interacting via long-range (Coulomb) and short- 
range (screened) potentials. Aminoacids in proteins carry electric charges and electrostatic 
interactions play an important role in determining their behavior fl|]. Thus, polyampholyte 
models are expected to be useful for the investigation of biologically relevant polymers. 

Due to monomer-monomer and monomer-solvent interactions, even in absence of 
charges, a linear polymer undergoes a collapse 0-transition as the temperature T is var- 
ied 0. In particular, in the case of a homopolymer, for T > Tq the chain is swollen and its 
average radius of gyration grows as R g oc N u (N is the number of steps, v = 3/4 in 2D |7j 
and v = 0.588 in 3D ||), while for T <T® the polymer is compact [y = 1/d). At T = T& a 
distinct universality class [y = uq = 4/7 in 2D |J and 1/2 in 3D ||]) characterizes scaling. 

The existence of a collapse transition is by now well established for polyampholyte models 
with both short-range and long-range interactions due to randomly distributed charges. In 
the long-range Coulomb case, the total charge distributed along the chain must not exceed 



a critical value ( AQ ~ v N) in order for the O-transition to occur [JTOj] . In the short-range 



case there is a collapse transition as long as the charge unbalance x = \N + —N^.\/(N + + N_) 



is less than some cutoff value [11]. The collapse of randomly charged polymers owes much 



of its importance to the close connection with protein folding [|T2j and is similar to that of 



homopolymers, with a B-scaling regime separating compact and swollen phases. 

An important, yet unsettled issue is that of establishing whether the B-transition of 
polyampholytes falls in the same universality class as the collapse of homopolymers described 
above. 

By a numerical study of randomly charged self-avoiding walks (SAW) with nearest neigh- 
bor interactions on the square lattice, Kantor and Golding got uq = 0.60 ± 0.02 for 



globally neutral polyampholytes, slightly, but appreciably different from that of homopoly- 



mers (uq = 4/7). These results, based mostly on MC enumerations (up to N = 99) and on 
a relatively limited sampling over disorder, suggest that the presence of quenched random 
interactions can modify the universality class of the 0-transition. This is a quite intriguing 
possibility, also in view of the constant focus on universality issues in the literature on 0- 
transitions A major limitation of this kind of studies is due to the need of performing 
averages over charge disorder, a task which, at an exact level, becomes impossible, with 
standard algorithms, as soon as iV > 15 in 2D. It is not obvious whether, for relatively 
short chains, quenched averages over few disorder configurations (10 -=-100 in ref fl3|]) should 
be sufficient for a satisfactory determination of the radius of gyration. Especially in the low- 
temperature regime, fluctuations of thermodynamic quantities due to quenched disorder are 
indeed very large and random sampling over a small number of disorder realizations can 
lead to inaccurate results for quenched averages. The investigation of the low-temperature 
phase of random polyampholytes is a formidable challenge for MC methods, because of the 
prohibitively large samplings it should require. On the other hand, even thermal averages 
become a problem at low T. Indeed, dynamic MC methods like the pivot algorithm are 
efficient in the high temperature regime but much less reliable as soon as the temperature 
is lowered below T® and chains are compact |TJJ|. Finally, Markov chain based MC meth- 



ods do not allow computation of entropic exponents, which would provide a more complete 
characterization of the universality of the transition. 

The above considerations suggest that an interesting strategy in the study of random 
polyampholyte models can be that of extending as far as possible, by appropriate algorithms, 
the range of chain lengths within which we can still gather exact information. In the present 
work we perform exact enumerations up to chains with N = 21 in 2D and iV = 15 in 3D 
for polyampholytes with charge disorder and nearest neighbor interactions and carry out 
quenched averages over all disorder realizations. To this purpose we developed a powerful 
algorithm for short-range interacting SAW's with charge disorder, in such a way as to re- 
duce by orders of magnitude the required computational effort, compared to more standard 
methods. 



We represent each iV-steps polymer chain configuration by a SAW uo (\u\ = N) on square 
or cubic lattice. In each of the N +1 visited sites sits a charged monomer. The hamiltonian 
takes the form: 



H{{q},u>) 



(1) 



where g» = ±1 is the charge carried by the i-th monomer and indicates pairs of (non- 
consecutive) nearest neighbor monomers. The sequence of the N + 1 charges distributed 
along the chain is denoted by {q} = {q , q 1: q N } and is assumed to be globally neutral 
(N + 1 even). The quenched average squared radius of gyration is: 



R 2 g (N,T) =—j: z {< 1 }( N ^r 1 



q {1} 



E • 

_H=n 



-H{{q},uj)/T r 2 



(2) 



where N q is the total number of charge sequences, r g (u) is the radius of gyration with respect 
to the center of mass of the configuration u, and Z{ q }(N, T) is the canonical partition function 
for a given realization {q} of disorder: 



Z {q} (N,T)= £ e 

\ui\=N 



-H({q},w)/T 



(3) 



Near the 0-point the radius of gyration is expected to obey the tricritical scaling form 



15,16 : 



R 2 JN,T) ~ N^fiN^r), 



(4) 



where r = (T — Tq)/Tq is the temperature distance from the 0-transition and 4>q is the 
crossover exponent. In the case of homopolymers in 2D, uq = 4/7 and 0e — 3/7 0. 
The annealed partition function is defined as the average of (0) over all sequences: 



£ (a) (A^T) = -L$> w (Ar,T) 

iVg {Q} 

while the quenched partition function is defined in terms of the quenched free energy: 

1 



(5) 



Z (q) ( N i T ) = eX P 



J2\og(Z {q] (N,T)) 



{?} 



(6) 
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Both annealed and quenched partition functions are expected to scale as: 

z (a/a ,(jv,r)^/*(«/„)(r)^c^co-i. (7) 

The connectivity \x is lattice and temperature dependent, while 7 is the universal entropic 
exponent. Like in the homopolymer case, for high T the exponent j(T) should identically 
take the value appropriate to SAW's in the swollen regime (jsaw = 43/32) f7j. In presence 
of a 6-collapse, at Tq, 7 is expected to assume a different value, 70, which is peculiar of the 
universality class of the transition. At lower temperatures, because of the globular shapes 
of the collapsed polymer, surface effects are present and can modify the above scaling form 
for Z with the appearance of an extra exponential factor, besides /i [ |17|j . 

It is not obvious, a priori, that [i and 7 should take the same values for the quenched and 
annealed problems. While this is plausible at relatively high T, where quenched disorder 
plays a minor role, discrepancies can be anticipated at low T. A main issue here is to establish 
whether T© is still included in the high-T range. 

Different entropic exponents have to be defined when polymers are subject to geometrical 
constraints: if the polymer chain is forced to live in half space by an impenetrable wall to 
which one of its ends is fixed, the critical entropic exponent assumes a value 71, different 
from the bulk 7 ||. The entropic behavior of SAW's at boundaries already played a major 
role in studies aimed at a precise characterization of the universality class of the 0-transition 
of homopolymers (in 2D: 7© = 8/7, 71© = 4/7, 7n© = -4/7 |8|,[19|). 

The numerical study of entropic exponents is greatly facilitated by considering simulta- 
neously data for bulk and boundary behavior pOf . Indeed, if the polymer is not adsorbed, 



the connectivity /x is insensitive to the presence of boundary and remains the same for both 
bulk and boundary behavior of Z. Below we indicate by Z bulk and Z hal f the respective 
partition functions. Thus, the ratio between bulk and boundary Z's scales as a power of the 
difference between the respective 7's and does not depend on /i, whose estimation is then 
not necessary. Due to these circumstances, the determination of 7 — 71 gets easier and much 
more accurate. 



Here we call contact a pair of non-consecutive monomers on nearest-neighbor sites, i.e. 
two interacting monomers. The contact map of a given SAW configuration u is the set of 
all contacts it contains: 

X(u) = : \u(i) - u(j)\ = 1,1* — j\ > 1} • (8) 

A contact map of an iV-steps SAW can also be represented by an (iV+1) x (JV+1) matrix, 
whose element is 1 or 0, according to whether the monomers % and j are interacting or 
not , respectively. For any given {q}, the energy of a configuration uo is fully determined by its 
contact map X(uj). Two configurations u), u' which are characterized by the same structure 
of contacts (X(u) = X(u')), have the same energy for every {q}, and can be considered as 
equivalent. The set of all cu's of a given length can be partitioned into equivalence classes, 
each of them containing all the walks which are characterized by a given contact map. The 
number of equivalence classes is equal to the number, Sn, of distinguishable contact maps 
X a , a = 1, .., Sn- Each equivalence class C a = {u : X{uj) = X a } is characterized by its own 
degeneracy g(a) and cumulative squared radius of gyration p 2 g (ot): 

g{oc) = E i; p» = E (9) 

g(a) is expected to grow exponentially with the difference between the number of steps iV 
and the number of contacts in X a |[21|| . This means that Sn still grows exponentialy with 



N, but much more slowly than the total number of SAWs Cn (see Table 1). In particular 
the ratio Sn/Cn is expected to approach zero exponentially. 

The sum over configurations u in (0) and (|3]) can be replaced by the sum over equiva- 
lence classes, each of them taken with its own degeneracy and cumulative squared radius of 
gyration: 



R 2 g (N,T) = ^Y,Z {q} (N,T)- 1 



Sn 

E e- H ^ x ^ T p 2 g (a) 

a=l 



(10) 



5jv 

Z {q} (N,T) = J2e- H ^> x "V T g(a). (11) 

a=l 



In terms of computational cost, the last equations are considerably cheaper than eqs. (0) 
and (0). The main improvement regards the thermal averages over configurations u, which 
are made considerably faster, due to the fact that they involve summations over Sn rather 
than Cn terms. Detailed enumeration of all cu's for each given sequence {q}, would become 
unfeasible as soon as iV > 15, when computing exact averages over disorder. 

In the present work, eqs. (|lOl) and flTlj) have been implemented by an efficient algorithm, 
in which SAW's of a given length are generated once for all. The structure of contacts of 
each walk is registered on a binary map. Whenever a new walk is generated, its contact map 
is analyzed and sorted: if the contact configuration has already occured, its degeneracy and 
cumulative gyration radius are updated, otherwise a new contact map is added. 

Once SAW's are fully enumerated, all contact maps X a are stored together with their 
g(a) and p 2 g (a). Disorder-averages of thermodynamic and geometric observables are then 
calculated over half the number of neutral sequences, being the hamiltonian invariant under 

{?} - {-<?}■ 

On a DEC 600 DIGITAL workstation exact enumeration of SAW's and complete quench- 
ing over all sequences require few minutes of CPU time for 16 monomer chains, about 130 
hours for 22 monomer chains. 

The same algorithm was later on adapted in order to compute annealed averages over 
the same realizations of disorder. The computation of annealed averages is slightly faster 
than that of quenched averages. Thus, we could easily obtain exact results with annealed 
disorder for N up to 21 in 2D. 

In order to study the 9-point we computed effective v exponents: 



v(N,K,T) = ± log 



Rl(N,T) 



log 



N 



N-K 



(12) 



R* g (N-K,T)_ 

In the N — > oo limit these curves should be step functions of T. However, at finite 
N, they show a rounded step. If the trends of approach of the N — > oo v values in the 
high T and low T phases are from opposite directions, the curves ([T^) are expected to 
intersect among themselves in the neighborhood of the O-point. They indeed show such 



behavior: effective exponents like v(N, 2, T) are monotonically increasing functions of N at 
high T and decreasing at not too low T. Linear extrapolation of these curves with respect 
to l/N, in the 1/N — > limit, allows to estimate an exponent v oa {T') which is close to or 
even below the compact-polymer value v = 0.5 for T just below the intersection region. 
On the other hand, ^oo(T) is almost equal to the swollen SAW value v = 0.75 at high T 
(Fig.l). Intersections of all the curves v(N, K, T) occur in a small region of the (T, v) plane, 
within which one can suppose the transition to be located. Following Privman [p2]l , in order 
to quantitatively pinpoint the G-transition, we calculated the coordinates (Tj„ t , h>i nt ) of all 
intersections between every pair of curves u(N, K, T) , v(N', K', T), and plotted these points 
against 1/N e ff = 2/(N+N'). The definition of N e ff is of course subjective. In our choice no 
role is played by the integers K and K' because of the weak dependence of the intersection 
locations on these parameters. For each 1/N e ff we computed the mean of T int and v int of the 
corresponding intersections and extrapolated them linearly as a function of 1/N e ff (Fig. 2) 
obtaining the estimates : v® = 0.58 ± 0.02 and Te = 0.80 ± 0.03. Uncertainty estimates are 
also based on comparison between extrapolations from data in different ranges of 1/N e ff. 
The exponent is fully compatible with homopolymer B-point universality. 

Another method can be applied in order to estimate vq and T@. As illustrated above, 
the effective exponents u^(T) = u(N,2,T) are monotonic functions of 1/N, decreasing for 
T > Te and increasing for T < T@ . Their linear correlation with respect of 1/N can be 
analyzed with the correlation coefficient defined by p3| : 



r(T) = Z N (l/N-l/N)(u N (T)-u N (T)) 

\ZJ2n(1/N - TJN) 2 Y.n(vn(T) - MT)7' 
where bars indicate averages over N. The coefficient r(T) is close to —1 for T > Tq and to 
1 for T < Te meaning that, in these regions, data are very well linear correlated and have 
opposite monotony. In the region r(T) undergoes a sudden jump between 1 and —1. Its 
derivative with respect to temperature shows a high and sharp peak whose mean value and 
width localize Te and determine its uncertainty ATe- The extremal values taken by ^oo(T) 
in the interval [Tq — ATe, Tq + ATe] give an estimate of vq and of the corresponding error 
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Auq. T@ and Uq obtained with this method are almost identical to the values determined 
above by extrapolating the intersections v int and T int , respectively. 

In order to obtain the crossover exponent <ft@ we analyzed the derivative of the squared 
radius of gyration with respect to temperature. Near the 0-point, this quantity should scale 
as: 

^R 2 g (N,T) ~ N *e<X)+MT). ( 14 ) 

The effective exponent curves corresponding to <fi + 2v do not clearly intersect each other 
in a narrow region of the (T, <ft + 2v) plane. So, the method used for determining v® cannot 
be applied in this case, because it would lead to ambiguos results. Following ref. |p2| , we 
then calculated, for each intersection (T int ,is int ) between u(N,K,T) and v(N' , K' ,T), with 
N > N', the quantity: 



log 



dR 2 g (N,T mt )/dT 
dR 2 (N-K,T int )/dT, 



log 



N 



1 -l 



N — K 



ZVint- (15) 



Extrapolation of these data in 1/N e ff leads to the estimate: 0e — 0.40 ± 0.08 (Fig.3). 
The computation of the crossover exponent for homopolymer 0-transitions is usually rather 
difficult and often leads to considerable overestimates P^P^fl . Our result is well compatible 
with the exact homopolymer value 0e = 3/7 ~ 0.42.. [g]. Attempts to determine <fi on 
the basis of data collapse fits for R g (eq.(|)) were not very successful because the collapse 
quality does not depend sensibly enough on this exponent. 

In the annealed system, frustration effects peculiar of quenched disorder are ruled out. 
The charges distributed along the chain are indeed free to rearrange among themselves in 
such a way to let nearest-neighbor interactions able to minimize the energy of each SAW 
configuration lu. It seems very plausible that such a rearrangement can produce a collapse 
in the same universality class as the 0-point of an ordered polymer with nearest neigh- 
bor attractive interactions for all monomers. Because of these reasons we expect annealed 
disorder to be irrelevant for the collapse transition. This conjecture is well confirmed by 
the analysis of our exact enumeration results for the annealed system (22 monomers). The 



analysis followed the lines of the quenched case. The transition exponents of the annealed 
model were estimated as v® = 0.58 ± 0.02 and 0e = 0.41 ± 0.08. 

A direct comparison between annealed and quenched entropies turns then out to be a very 
significant test, in view of the fact that the annealed system represent a sort of substitute 
of the pure one. As explained above, to avoid difficulties due to the calculation of the non- 
universal constant /x, in the case of both quenched and annealed charges, we analyzed the 
ratio Z between the partitions of SAW's in the bulk and in presence of boundary, which is 
expected to scale as: 



The sequences (7 — 7i)( / g ) (N,T), plotted against 1/N, show remarkably good linear 
correlation. Their extrapolation for 1/N — > gives a reasonable estimate of the expected 
(7 — 7i)(o/g) i n the high-T range and close to Tq. Even more precise is the comparison 
between the annealed and quenched cases based on these 7 — 71 estimates. It turns out 
that the difference 7 — 71 is almost identical for annealed and quenched systems on a range 
of temperatures which clearly extends below the 0-temperature (Fig. 4). We estimated 
(7 ~ 7i) ~ 0.50 and (7 — 71) ~ 0.39 at the G-point and in the high T region, respectively. 
The O-point determination is slightly below the homopolymer value ((7 — 71)0 =4/7 [|i~8|j), 
while the high T one is almost coinciding with the SAW one: ((7 — Ji)saw = 25/64 ||). 

In 3-D, for a homopolymer, i/q is expected to be equal to 1/2 with logarithmic corrections 
||. Indeed d — 3 is the upper critical dimension for the transition. We applied our methods 
to our model of random polyampholytes in 3D and computed exact averages for chains up 
to 15 monomers. A simple analysis of the radius of gyration, not including logarithmic 
corrections, gives v® = 0.51 ± 0.04, again consistent with the homopolymer universality 
class. 



Z(a/ q )(N,T) 



Z* a %(N,T) 



jv(^W T ). 



(16) 



Effective exponents can be obtained from: 




(17) 
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Altogether the above results give very strong evidence that the collapse transition of 
the globally neutral random polyampholyte model falls in the same universality class as 
the 0-point of homopolymers. Support to such a conclusion comes from the exponent 
determinations we were able to perform. Further evidence comes from our comparative 
analysis of entropic properties in the case of quenched and annealed disorders. Our study of 
7 — 71 shows that annealed and quenched partition functions start to deviate appreciably at 
some temperature falling definitively below the estimated T&. In order to obtain a collapse 
with exponents different from those of homopolymer models one should have conditions such 
that the effect of quenched disorder become important above, or, at least, at the collapse 
transition temperature. The identification and investigation of models where such conditions 
could possibly be realized remains an important open issue in the field, whose solution would 
sensibly increment our understanding of the possible role played by chain disorder in polymer 
statistics. 

With the exact enumeration methods developed for the study of the B-transition we 
could also perform an analysis of how the actual partition function at fixed {q}, Zi g \, 
deviates from its (annealed) average at low temperature. Histograms of quantities like 
Z{ q }(N,T)/Z( a }(N,T) show very clearly a lack of self- averaging at T sufficiently lower than 
Tq. While in a range of high T including Tq they are narrow peaked around the value 1, 
for lower temperature they are quite broad. At very low T such histograms acquire a sparse 
structure and allow to investigate folding properties of the model. 

While for T approaching zero homopolymers collapse to many compact conformations 
with the same ground state energy, most heteropolymer sequences usually collapse to very 
few lowest-energy conformations (see e.g. |p6|j ). In general, in order to well represent proper- 
ties of real proteins, a heteropolymer model is expected to admit a unique compact confor- 
mation with lowest-energy, i.e. a non-degenerate ground state, at least for some sequences. 
The percentage of sequences admitting a unique ground-state for the H-P heteropolymer 
model is believed from numerical analysis to remain almost constant as N increases [fZ7| . 



The H-P model is a SAW in which each monomer can have either a hydrophobic or a polar 
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character, with short range interactions to the nearest neighbor solvent molecules. This 
model has been often applied to protein folding studies (see e.g. p6|j28[] ). Here we investi- 
gated the number Jn of sequences having a unique "native state" in our 2D model. This 
analysis was performed by applying the exact method described above to the investigation 
of ground states of Hamiltonian walks |2!| on the square lattice, for chain lengths up to 25. 
It turns out that /jv grows with N at a reduced exponential rate with respect to the total 
number of sequences N q . In particular we found: fx — 1.85^ while N q ~ N~ 1 ^ 2 2 N . Thus, 
the percentage of sequences which possess a unique ground state tends asymptotically to 
zero as N — > oo. This behavior is in sharp contrast with that found in the H-P model |27| . 

We thank F. Seno for useful discussions and help in the analysis. We are also indebted 
with S. G. Whittington for discussions and with C. Vanderzande for a critical reading of the 
manuscript. Partial support from the European Network Contract ERBFMRXCT 980183 
is also acknowledged. 
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FIGURE CAPTIONS 



Fig 1: Effective v(N,K = 2, T) exponents (solid lines) and their linear extrapolation ^(T) 
for 1/N — > (dot-dashed line). Temperature is normalized to monomer-monomer interac- 
tion. For sufficiently low T the sequences cease to be monotonic. Of course, the relatively 
short length of the chains rounds off the expected step-like shape of at the 0-transition. 

Fig 2: Values of u int as a function of 1/N e ff. Rhombs indicate the means of the exponent 
extimates at fixed N e ff , while horizontal bars limit the variance of their distribution. 

Fig 3: Extrapolation of the crossover exponent 00. 

Fig 4: Extrapolation of 7 — 71 for annealed (dashed line) and quenched (dot-dashed line) 
disorder. The values are almost identical in a range of temperatures extending below T ~ 
0.80. 

Tab 1: Comparison between the number of different contact maps Sn and in 2D for 
N — 7, 8, .., 21. Also even values of iV are reported for completeness. 
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